Abstract

Stackelberg Pricing Games is a two-level combinatorial pricing problem studied in the Economics,
Operations Research, and Computer Science communities. In this paper, we consider the decade-old
shortest path version of this problem, which is the first and most studied problem in this family.

The game is played on a graph (representing a network) consisting of fixed cost edges and pricable 
or variable cost edges. The fixed cost edges already have some fixed price (representing the competitor's 
prices). Our task is to choose prices for the variable cost edges. After that, a client will buy the cheapest 
path from a node s to a node t, using any combination of fixed cost and variable cost edges. The goal is 
to maximize the revenue on variable cost edges. 

In this paper, we show that the problem is hard to approximate within $2 - \epsilon$, improving the previous
APX-hardness result by Joret [to appear in Networks]. Our technique combines the existing ideas with a 
new insight into the price structure and its relation to the hardness of the instances. 

1 Introduction 

A new startup company has just acquired some links in a network. The company wants to sell these links
to a particular client, who will buy a cheapest path from a node s to a node t. However, this company is 
not alone in the market: there are other companies already in the market owning some links with some 
fixed prices. The goal of this new company is to price its links to maximize its profit, having the complete 
knowledge of the network and knowing that the client will buy the cheapest s-t path (which may consist of 
links from many companies). Of course, if they price a link too high, the client will switch to other links 
and if they price a link too low then they unnecessarily reduce their profit. 

This problem is called the Stackelberg Shortest Path Game (StackSP) and can be defined formally as
follows. We are given a directed graph $G = (V, E)$, a source vertex $s$ and a sink vertex $t$. The set $E$ of edges
is partitioned into two sets: $E_f$, the set of fixed cost edges, and $E_v$, the set of pricable or variable cost edges.
Each edge $e$ in $E_f$ already has some price $p(e)$. Our task is to set a price $p(e)$ for each variable cost edge $e$.
Once we set the prices, the client will buy a shortest path from $s$ to $t$ (i.e., a path $P$ such that $\sum_{e \in P} p(e)$ is
minimized). Our goal is to maximize the profit; i.e., maximize $\sum_{e \in P \cap E_v} p(e)$, where $P$ is the path bought
by the client. Throughout, we let $m$ denote the number of variable cost edges. It is usually assumed that if
there are many shortest paths, the client will buy the one that maximizes our profit.
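
The client's behavior in this definition can be illustrated with a short sketch. The Python code below is only an illustration under stated assumptions, not part of the formal development: the function name `follower_revenue`, the dictionary-based graph encoding, and the toy one-gadget graph are hypothetical choices made here, and all prices are assumed to be non-negative. It runs Dijkstra's algorithm with the lexicographic key (path cost, negated revenue), so that among all cheapest $s$-$t$ paths the client takes one that maximizes our revenue.

```python
import heapq
from itertools import count

def follower_revenue(fixed, priced, s, t):
    """fixed: {(u, v): cost}, priced: {(u, v): price}, all values >= 0.
    Returns (cost of the path bought by the client, revenue on priced edges),
    or None if t is unreachable. Ties in cost are broken toward higher revenue."""
    adj = {}
    for (u, v), c in fixed.items():
        adj.setdefault(u, []).append((v, c, 0))   # fixed edge: no revenue
    for (u, v), p in priced.items():
        adj.setdefault(u, []).append((v, p, p))   # priced edge: revenue = price
    best = {s: (0, 0)}                            # node -> (cost, -revenue)
    tie = count()                                 # avoids comparing node labels
    heap = [(0, 0, next(tie), s)]
    while heap:
        cost, neg_rev, _, u = heapq.heappop(heap)
        if (cost, neg_rev) > best[u]:
            continue                              # stale heap entry
        if u == t:
            return cost, -neg_rev
        for v, c, r in adj.get(u, []):
            cand = (cost + c, neg_rev - r)
            if v not in best or cand < best[v]:
                best[v] = cand
                heapq.heappush(heap, (cand[0], cand[1], next(tie), v))
    return None

# Toy graph (hypothetical): one fixed edge s->t of cost 1 competing with the
# path s->u->v->t whose middle edge is priced by us.
toy_fixed = {("s", "t"): 1, ("s", "u"): 0, ("v", "t"): 0}
for price in (0.5, 1, 2):
    print(price, follower_revenue(toy_fixed, {("u", "v"): price}, "s", "t"))
```

In this toy graph, pricing the edge above the competing fixed cost loses the client entirely, which is the trade-off described in the introduction.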

Due to its connection to road network tolling and bilevel programming, there has been an enormous effort
to understand the problem by means of bilevel programming [24, 12, 14, 19, 21, 20, 13, 4], finding
polynomial-time solvable cases [24, 29, 18, 27, 10, 3, 6], solving the problem by heuristics [16, 15], and
approximating the solution [26, 6, 23, 28]. In this paper, we focus on approximability of this problem. In
this realm, StackSP is the first and the most studied problem in the growing family of one-follower (i.e.,
one client) Stackelberg network pricing games [26, 6, 23, 28, 2, 8, 9, 5].

The Stackelberg pricing problems belong to the class of two-player two-level optimization problems,
which is a subclass of bilevel linear programming. These problems have a rather strange structure,
and this makes standard approximation techniques such as linear programming seemingly inapplicable.
For example, a natural LP formulation for StackSP (and also another version called StackMST) has an
integrality gap of $\Omega(\log m)$. Moreover, by using the most (and probably the only) natural upper bound for
OPT, one cannot obtain an approximation factor better than $O(\log m)$ [26], so the line of attack considered
in [26] and [6] cannot be pushed any further.

Proving the hardness of this problem seems to face an equally big obstacle. In fact, the progress on the
hardness side for the family of Stackelberg pricing problems stops at small constant hardness (APX-hardness
in [23, 8] and only NP-hardness in [?]). Moreover, a reduction from the Unique Coverage problem [?], which
proved useful for many pricing problems (including StackSP with multiple followers), apparently does not
apply here. In particular, for StackSP, only NP-hardness, strong NP-hardness, and APX-hardness (with
a constant as small as 1.001) are shown [24, 26, 23]. In fact, even for approximating the general bilevel
program, only constant ratios can be ruled out [17, 22].

We believe that an improvement to the upper or lower bound of the problem might shed some light on
approximating a larger subclass of bilevel programs, perhaps generating a new set of techniques for attacking 
the whole family of Stackelberg problems. (The problem seems to require a new technique due to its bizarre 
behavior.) 

Our result and techniques In this paper, we give the first result beyond a very small constant hardness: 
Theorem 1.1. For any $\epsilon > 0$, it is NP-hard to approximate StackSP to within a factor of $2 - \epsilon$.

The key insight in obtaining this result comes from exploring the structure of the edge prices, which
was not exploited in the previous inapproximability results [24, 26, 23]. The previous results encode the
constraints of a constraint satisfaction problem (Max 3SAT in their cases) using certain gadgets and
glue these gadgets together in a uniform way (i.e., using the same edge price throughout). In contrast, we
study the influence of non-uniform prices on the hardness of the resulting instances. In particular, we study
how the prices of the fixed cost edges affect the hardness of the gadgets and find an optimal price which
strikes a balance between being too high (which could hugely reduce the revenue but is easy to avoid) and
too low (which is likely to be used but does not affect the revenue much). This observation, armed with a
stronger constraint satisfaction problem (i.e., the Raz verifier for Max 3SAT(5)) and the right choice of prices,
leads to a $(2 - \epsilon)$-hardness of approximation. The techniques above are strong enough that the hardness
result is obtained with only a slight modification of the gadgets. However, due to the non-uniformity of 
the prices, a more sophisticated analysis is required. In particular, our analysis relies on a technique called 
Path Decomposition which breaks the shortest path in the optimal solution into subpaths with manageable 
structure. We will be able to get deeper into the intuition after we describe the hardness construction in the 
next section. 

Related work 

StackSP was first proposed by Labbé et al. [24], who also derive a bilevel LP formulation of the problem and
prove NP-hardness. On the algorithmic side, Roch et al. present the first, and still the best, approximation
algorithm, which attains an $O(\log m)$ approximation factor. Another $O(\log m)$ approximation algorithm is
obtained by Briest et al., which has a slightly worse approximation guarantee (larger constant in front of the
$\log m$ term) but is simpler and applicable to a much richer class of Stackelberg pricing problems. Even
though the algorithm of Briest et al. does not rely on the specific problems' structures, it remains unclear 
whether one can exploit a special structure of each problem to improve the approximation ratio. 

Another interesting problem in the family of one-follower Stackelberg network games is the Stackelberg
Minimum Spanning Tree Game (StackMST), in which the client aims to buy the minimum spanning tree
instead of the shortest path. Cardinal et al. [8] introduce this problem and prove that it is APX-hard but has
an $O(\log m)$ approximation algorithm. Very recently, they consider the special cases of planar and
bounded-treewidth graphs [9] and prove that even in such graph classes, StackMST remains NP-hard. There are
also many other variations in the family of Stackelberg games, depending on what the client wants to buy.
These include vertex cover [6, 5], shortest path tree [?], and knapsack [?].

Among the known approximation algorithms, the most universal one is an $O((1+\epsilon) \log m)$-approximation
algorithm invented by Briest et al. [6]. This elegant algorithm works on a large class of problems, including
StackSP and StackMST, and is coupled with a simple analysis. In the same paper, the case of $k$ clients is
also considered. An $O((1 + \epsilon)(\log m + \log k))$-approximation algorithm is given, and the problem is shown
to be hard to approximate within $O(\log^{\epsilon} m + \log^{\epsilon} k)$ for some large $k$. Therefore, the gap is almost closed
in the case of many clients while left wide open when $k$ is small (e.g., $k$ is constant, and particularly when
$k = 1$).

In the Economics and Operations Research literature, StackSP is also known as a tarification problem.
Many special cases are considered and polynomial-time algorithms are given for this problem [24, 29, 18,
27, 10, 3, 6]. It is also sometimes called a bilevel pricing problem due to its connection to the bilevel linear
program. (See a formulation in, e.g., [24].) StackSP is also heavily studied from this perspective [24, 12,
14, 19, 21, 20, 13, 4]. Approximating a solution of a bilevel program to within any constant factor is shown
to be NP-hard [22, 17]. Unfortunately, these reductions do not extend to the family of Stackelberg games
due to the specific structures of the constraints used in the reductions of [22, 17]. For more details, we refer the
readers to [?] and references therein.

Remark Recently Briest and Khanna [?] discovered a similar result to ours using a different approach. They
show that StackSP is hard to approximate within a factor of $2 - o(1)$.

Organization Our construction is a reduction from the Raz verifier for Max 3SAT(5). We first give an
overview of the Raz verifier in Section 2. We then describe our reduction in Section 3 before we are able to
give more intuition behind the construction and its analysis. This will be done in Section 4. We then show a
formal analysis in Section 5.

2 Raz Verifier 

Our reduction uses the Raz verifier for Max 3SAT(5) with $\ell$ repetitions. We explain this framework in this
section. The given instance of Max 3SAT(5) is a 3CNF formula with $n$ variables and $5n/3$ clauses where
each clause contains exactly 3 different literals, and each variable appears in exactly 5 different clauses.

Let $\epsilon$ be a constant and let $\varphi$ be an instance of Max 3SAT(5). Then $\varphi$ is called a Yes-Instance if there
is an assignment that satisfies all the clauses, and it is called a No-Instance if any assignment satisfies at
most a $(1 - \epsilon)$ fraction of the clauses. The following is a form of the PCP theorem.

Theorem 2.1. There is a constant $\epsilon$, $0 < \epsilon < 1$, such that it is NP-hard to distinguish between
Yes-Instances and No-Instances of the Max 3SAT(5) problem.

The Raz verifier for Max 3SAT(5) with $\ell$ repetitions is a two-prover one-round interactive proof system.
The verifier sends one query to each prover simultaneously. The first prover is asked for an assignment
to the variables in the given clauses that satisfies all of these clauses, while the second prover is asked for
an assignment to the given variables. The verifier will accept the answers if and only if both provers return
consistent assignments. The detailed description of the provers-verifier actions is as follows.

• The verifier first chooses $\ell$ clauses, say $C_1, \ldots, C_\ell$, independently and uniformly at random (with
replacement). Next, it chooses one variable in each of these clauses uniformly at random. Let $x_1, \ldots, x_\ell$
denote the resulting (not necessarily distinct) variables.

• The verifier generates a query $q$ consisting of the indices of $C_1, C_2, \ldots, C_\ell$ and a query $q'$ consisting of
the indices of $x_1, x_2, \ldots, x_\ell$. The verifier then sends $q$ and $q'$ to Prover 1 and Prover 2, respectively.

• Prover 1 returns an assignment to all variables associated with clauses $C_1, C_2, \ldots, C_\ell$.

• Prover 2 returns an assignment to variables $x_1, x_2, \ldots, x_\ell$.

• The verifier reads the assignments received from both provers and accepts if and only if the assignments
are consistent and satisfy $C_1, C_2, \ldots, C_\ell$.
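
The protocol steps above can be sketched in a few lines of Python. This is a hedged illustration only: the names `verifier_round` and `honest_provers`, the encoding of literals as nonzero integers (negative meaning negated), and the toy formula are assumptions introduced here, and the toy formula does not satisfy the Max 3SAT(5) occurrence restriction.

```python
import random

def verifier_round(clauses, prover1, prover2, ell, rng):
    """One round of the two-prover protocol. Returns True iff the verifier accepts."""
    picks = [rng.randrange(len(clauses)) for _ in range(ell)]   # clauses, with replacement
    chosen = [abs(rng.choice(clauses[c])) for c in picks]       # one variable per clause
    ans1 = prover1(tuple(picks))    # list of {variable: bool}, one dict per clause
    ans2 = prover2(tuple(chosen))   # list of bool, one per chosen variable
    for c, x, a1, a2 in zip(picks, chosen, ans1, ans2):
        satisfied = any(a1[abs(lit)] == (lit > 0) for lit in clauses[c])
        if not satisfied or a1[x] != a2:
            return False            # clause unsatisfied or answers inconsistent
    return True

def honest_provers(clauses, assignment):
    """Both provers answer according to one fixed global assignment."""
    def prover1(picks):
        return [{abs(l): assignment[abs(l)] for l in clauses[c]} for c in picks]
    def prover2(chosen):
        return [assignment[x] for x in chosen]
    return prover1, prover2

# Toy satisfiable formula (hypothetical) and a satisfying assignment.
toy_clauses = [(1, 2, 3), (-1, 2, 4), (1, -3, -4)]
toy_assignment = {1: True, 2: True, 3: False, 4: False}
p1, p2 = honest_provers(toy_clauses, toy_assignment)
accepted = all(verifier_round(toy_clauses, p1, p2, 3, random.Random(seed))
               for seed in range(100))
```

With honest provers and a satisfying assignment every round is accepted, which corresponds to the completeness case of the verifier.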

Intuitively, for a Yes-Instance, both provers can ensure that the verifier always accepts by returning
the satisfying assignments to the verifier. On the other hand, any strategy of the provers fails with high probability
in the case of a No-Instance. This is an application of the Parallel Repetition Theorem and Theorem 2.1
and can be stated formally as follows.

Theorem 2.2 ([25, ?]). There exists a universal constant $\alpha > 0$ (independent of $\ell$) such that

• If $\varphi$ is a Yes-Instance, then there is a strategy of the provers that makes the verifier accept with
probability 1.

• If $\varphi$ is a No-Instance, then for any strategy of the provers, the verifier accepts with probability at most $2^{-\alpha\ell}$.

In our reduction, we view the Raz verifier as the following constraint satisfaction problem. We have two
sets of queries, $Q_1$ and $Q_2$, corresponding to all possible queries sent to Prover 1 and Prover 2, respectively.
That is, $Q_1$ consists of all possible choices of $\ell$ clauses sent to Prover 1 (hence, $|Q_1| = (5n/3)^\ell$) and $Q_2$
consists of all possible choices of $\ell$ variables sent to Prover 2 (hence $|Q_2| = n^\ell$). For each $q \in Q_1 \cup Q_2$,
let $A(q)$ denote the set of all possible answers to $q$. Notice that $|A(q)| = 7^\ell$ if $q \in Q_1$ (since there are 7
ways to satisfy each of the $\ell$ clauses given to Prover 1) and $|A(q)| = 2^\ell$ if $q \in Q_2$ (since there are 2 possible
assignments to each of the $\ell$ variables given to Prover 2). Denote by $A_1$ and $A_2$ the sets of all possible answers
by Prover 1 and Prover 2, respectively.

We denote the set of constraints by $\Phi$. Each constraint in $\Phi$ corresponds to a pair $(q_1, q_2)$ of queries sent
by the verifier. That is, for each random string $r$ of the verifier, there is a constraint $(q_1, q_2) \in Q_1 \times Q_2$ in
$\Phi$ where $q_1$ and $q_2$ are the queries sent to Prover 1 and Prover 2, respectively. A constraint $(q_1, q_2)$ is satisfied
if and only if the assignments to $q_1$ and $q_2$ are consistent. For convenience, we will treat $\Phi$ as the set of
all possible random strings, and we denote, for each random string $r$, the corresponding queries by $q_1(r)$
and $q_2(r)$, respectively. Note that each query $q \in Q_1$ is associated with $3^\ell$ constraints in $\Phi$ and each query
$q' \in Q_2$ with $5^\ell$ constraints. Moreover, let $M = |\Phi|$. We have $M = (5n)^\ell$. The goal of this problem is to
find an assignment $f : Q_1 \to A_1, Q_2 \to A_2$ that maximizes the number of satisfied constraints in $\Phi$.
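
As a quick sanity check of these counts, the following Python sketch (the function name `csp_sizes` and the dictionary keys are hypothetical) computes the sizes for given $n$ and $\ell$ and verifies the double-counting identities $|Q_1| \cdot 3^\ell = |Q_2| \cdot 5^\ell = M$, assuming $n$ is a multiple of 3.

```python
def csp_sizes(n, ell):
    """Sizes of the constraint satisfaction problem for n variables, ell repetitions."""
    assert n % 3 == 0, "the instance has 5n/3 clauses, so n must be a multiple of 3"
    num_clauses = 5 * n // 3
    return {
        "Q1": num_clauses ** ell,    # queries to Prover 1
        "Q2": n ** ell,              # queries to Prover 2
        "A1_per_query": 7 ** ell,    # satisfying answers per Prover-1 query
        "A2_per_query": 2 ** ell,    # answers per Prover-2 query
        "M": (5 * n) ** ell,         # number of constraints
    }

sizes = csp_sizes(6, 2)
# Each Prover-1 query lies in 3^ell constraints, each Prover-2 query in 5^ell.
assert sizes["Q1"] * 3 ** 2 == sizes["M"] == sizes["Q2"] * 5 ** 2
```
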
The following corollary can be directly obtained from Theorem 2.2.

Corollary 2.3. If $\varphi$ is a Yes-Instance, then there is an assignment to $Q_1 \cup Q_2$ such that all constraints
in $\Phi$ are satisfied. Otherwise, no assignment satisfies more than a $2^{-\alpha\ell}$-fraction of the constraints in $\Phi$.

3 The Reduction 

Let $\epsilon > 0$ be the constant from Theorem 1.1. Recall that we want to prove $(2 - \epsilon)$-hardness of approximation.

Overview Starting with an instance $\varphi$ of Max 3SAT(5), we first perform the two-prover protocol with
$\ell = \lceil \log(3/\epsilon)/\alpha \rceil$ rounds, and we enumerate all possible constraints in $\Phi$. Next we transform $\Phi$ to an
instance of the Stackelberg problem in two steps, as follows. In the first step of the reduction, we order the
constraints in $\Phi$ to get a $(\delta, \gamma)$-far sequence (see Section 3.1). In the second step, we convert such a sequence
to an instance of the Stackelberg problem, denoted by $G$, using the construction explained in Section 3.2.

3.1 Obtaining a $(\delta, \gamma)$-far sequence

Definition 3.1. ($(\delta, \gamma)$-far constraint sequence) Consider a sequence of all possible constraints $r_1, \ldots, r_M$ in
$\Phi$. A constraint $r_i$ is said to be $\delta$-far if for every $j$ with $i < j < i + \lceil \delta M \rceil$, $q_1(r_i) \neq q_1(r_j)$ and $q_2(r_i) \neq q_2(r_j)$.
The sequence $r_1, \ldots, r_M$ is said to be $(\delta, \gamma)$-far if at least a $(1 - \gamma)$-fraction of the constraints is $\delta$-far.
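
The definition can be checked mechanically. The sketch below is a direct, unoptimized reading of Definition 3.1; the function names, the representation of a constraint as a pair of query labels, and the use of `Fraction` (to keep $\lceil \delta M \rceil$ exact) are assumptions made for illustration.

```python
import math
from fractions import Fraction

def far_flags(seq, delta):
    """seq: list of (q1, q2) pairs in sequence order.
    Returns one boolean per constraint: True iff that constraint is delta-far."""
    M = len(seq)
    window = math.ceil(delta * M)
    flags = []
    for i, (q1, q2) in enumerate(seq):
        # Check every j with i < j < i + ceil(delta * M).
        clash = any(seq[j][0] == q1 or seq[j][1] == q2
                    for j in range(i + 1, min(M, i + window)))
        flags.append(not clash)
    return flags

def is_far_sequence(seq, delta, gamma):
    """True iff at least a (1 - gamma) fraction of the constraints is delta-far."""
    return sum(far_flags(seq, delta)) >= (1 - gamma) * len(seq)

# Toy constraints (hypothetical labels), in two different orders.
order_a = [("C1", "x1"), ("C1", "x2"), ("C2", "x3"), ("C2", "x1")]
order_b = [("C1", "x1"), ("C2", "x3"), ("C1", "x2"), ("C2", "x1")]
half = Fraction(1, 2)
flags_a = far_flags(order_a, half)
flags_b = far_flags(order_b, half)
```

Interleaving constraints that share a query, as in `order_b`, is what makes more of them far.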

We can obtain a $(\delta, \gamma)$-far sequence with the right parameters for our purpose using probabilistic arguments.

Theorem 3.2. For any $\ell > 1$, $\delta > 1/M$ and $\gamma > (8\delta)5^\ell$, there is a polynomial-time algorithm $\mathcal{A}$ that
outputs a $(\delta, \gamma)$-far sequence.

Proof. We present a randomized algorithm here. In Appendix, we derandomize it to the desired $\mathcal{A}$ by the
method of conditional expectation. Let $r_1, r_2, \ldots, r_M$ be the constraints. Let $\mathcal{A}'$ be an algorithm that picks
a random permutation $\pi : [M] \to [M]$. We claim that the sequence $r_{\pi(1)}, \ldots, r_{\pi(M)}$ is $(\delta, \gamma)$-far with
probability at least 1/2.

To prove the above claim, consider each constraint $r_i$. Let $J = \{j \in [M] : q_1(r_j) = q_1(r_i)$ or $q_2(r_j) =
q_2(r_i)\}$. Notice that $|J| < 3^\ell + 5^\ell < 2 \cdot 5^\ell$ because there are $3^\ell$ constraints $r_j$ in $\Phi$ with $q_1(r_j) = q_1(r_i)$ and
$5^\ell$ constraints $r_j$ in $\Phi$ with $q_2(r_j) = q_2(r_i)$. For each such $j \in J$, the probability that $|\pi(i) - \pi(j)| < \lceil \delta M \rceil$
is at most $2\delta$. By applying the union bound over all such $j \in J$, the probability that $r_{\pi(i)}$ is not $\delta$-far is at
most $(4\delta)5^\ell < \gamma/2$. The expected number of constraints that are not $\delta$-far is at most $\gamma M/2$, so by Markov's
inequality, the sequence is $(\delta, \gamma)$-far with probability at least 1/2, and the claim follows. □
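
The randomized algorithm in this proof can be sketched as follows. This is only the randomized variant with retries, not the derandomized algorithm from the Appendix; the names `count_not_far` and `random_far_sequence`, the retry limit, and the toy constraint set are assumptions made here. Since each attempt succeeds with probability at least 1/2 when the parameters satisfy Theorem 3.2, a small number of retries is expected to suffice.

```python
import math
import random
from fractions import Fraction

def count_not_far(seq, delta):
    """Number of constraints (q1, q2) in seq that are not delta-far."""
    M = len(seq)
    window = math.ceil(delta * M)
    return sum(
        any(seq[j][0] == seq[i][0] or seq[j][1] == seq[i][1]
            for j in range(i + 1, min(M, i + window)))
        for i in range(M))

def random_far_sequence(constraints, delta, gamma, rng, max_tries=1000):
    """Shuffle until the order is (delta, gamma)-far; return None if all tries fail."""
    seq = list(constraints)
    for _ in range(max_tries):
        rng.shuffle(seq)                       # uniformly random permutation
        if count_not_far(seq, delta) <= gamma * len(seq):
            return list(seq)
    return None

# Toy constraint set (hypothetical): 60 constraints, each first query shared by 3
# constraints and each second query shared by 5, mimicking 3^ell and 5^ell for ell = 1.
toy = [(k // 3, k % 12) for k in range(60)]
delta, gamma = Fraction(1, 20), Fraction(1, 2)
far_seq = random_far_sequence(toy, delta, gamma, random.Random(0))
```
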

3.2 The Construction 

Given a $(\delta, \gamma)$-far sequence of constraints $r_1, \ldots, r_M$, we construct an instance of StackSP as follows.
For each constraint $r_i$, construct a gadget $G_i$ containing source $s_i$, destination $t_i$, and a set of intermediate
vertices $\{u_i^a, v_i^a\}_{a \in A(q_1(r_i))}$. There are $2 \cdot 7^\ell$ such intermediate vertices (since $|A(q_1(r_i))| = 7^\ell$).

Recall that, for each answer $a \in A(q_1(r_i))$, there exists a unique consistent answer $a' \in A(q_2(r_i))$.
In other words, for each $a \in A(q_1(r_i))$ there exists a unique $a' \in A(q_2(r_i))$ such that $(a, a')$ satisfies the
constraint $r_i$. From now on, we will use $\pi_i$ to denote the function that maps each $a \in A(q_1(r_i))$ to its
consistent answer $a' \in A(q_2(r_i))$. Therefore, each pair $u_i^a, v_i^a$ corresponds to a pair of possible answers
$(a, \pi_i(a))$ that satisfies $r_i$.

Figure 1: Example of graph $G$ constructed from the Max 2SAT instance $(x_1 \vee x_2) \wedge (x_1 \vee x_3)$ with $\ell = 1$ repetition. Each gadget $G_i$ is labeled
with the corresponding constraint $r_i$, and each variable edge $u_i^a v_i^a$ is labeled with the corresponding answer from Prover 1. Note
that the corresponding answer from Prover 2 can be identified easily. (For example, the edge $u_1^{(10)} v_1^{(10)}$ corresponds to assigning
$x_1 = 1$ and $x_2 = 0$. Therefore, Prover 2's corresponding answer for $r_1$ is $x_1 = 1$.) The bigger picture is in Appendix.

Edges in each gadget $G_i$ are the following.

• Fixed cost edges: There is a fixed cost edge of cost 1 from $s_i$ to $t_i$. There are also fixed cost edges of
cost 0 from $s_i$ to each of $u_i^a$, and from each of $v_i^a$ to $t_i$.

• Variable cost edges: There is a variable cost edge from $u_i^a$ to $v_i^a$ for each $a \in A(q_1(r_i))$.

Now we link all the gadgets together. First, for all $1 \leq i < M$, we create a fixed cost edge of cost 0
from $t_i$ to $s_{i+1}$. We denote the source of the instance by $s = s_1$ and the sink by $t = t_M$ (i.e., we want to buy a shortest
path from $s_1$ to $t_M$).

Next, we add another set of fixed cost edges, called shortcuts, whose job is to put constraints between
pairs of edges that represent inconsistent assignments. We only have shortcuts between far gadgets. (Gadget
$G_i$ is called a far gadget if its corresponding constraint $r_i$ is a $\delta$-far constraint.) Consider any pair of far
constraints $r_i, r_j$ for $i < j$ such that $r_i$ shares a query with $r_j$; i.e., either $q_1(r_i) = q_1(r_j)$ or $q_2(r_i) = q_2(r_j)$.
If $q_1(r_i) = q_1(r_j)$, we add a shortcut from $v_i^{a_i}$ to $u_j^{a_j}$ for every pair of $a_i \in A(q_1(r_i))$ and $a_j \in A(q_1(r_j))$
such that $a_i \neq a_j$. For the case when $q_2(r_i) = q_2(r_j)$, we add a shortcut from $v_i^{a_i}$ to $u_j^{a_j}$ for every pair of
$a_i, a_j$ such that $\pi_i(a_i) \neq \pi_j(a_j)$. We define the cost of this shortcut to be $(j - i)/2$.

This completes the hardness construction. It is easy to see that the instance size is polynomial (for 
completeness, we add the proof in Appendix). 
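
To make the construction concrete, here is a minimal Python sketch that builds the edge sets from a list of constraints. It is an illustration under assumptions, not the authors' code: the function name `build_instance`, the tuple encoding of vertices, the 0-based gadget indices, and the toy Max 2SAT input (with every gadget treated as far) are all hypothetical choices.

```python
def build_instance(constraints, far):
    """constraints[i] = (q1, q2, pi), where pi maps each Prover-1 answer a to the
    consistent Prover-2 answer; far[i] tells whether gadget i is far.
    Returns (fixed, variable): fixed maps an edge to its cost, variable lists
    the priceable edges. Gadgets are indexed from 0."""
    fixed, variable = {}, []
    M = len(constraints)
    for i, (_, _, pi) in enumerate(constraints):
        fixed[(("s", i), ("t", i))] = 1                  # bypass edge of cost 1
        for a in pi:
            fixed[(("s", i), ("u", i, a))] = 0
            fixed[(("v", i, a), ("t", i))] = 0
            variable.append((("u", i, a), ("v", i, a)))  # priceable edge for answer a
        if i + 1 < M:
            fixed[(("t", i), ("s", i + 1))] = 0          # link consecutive gadgets
    for i in range(M):
        for j in range(i + 1, M):
            if not (far[i] and far[j]):
                continue                                 # shortcuts only between far gadgets
            q1i, q2i, pii = constraints[i]
            q1j, q2j, pij = constraints[j]
            for ai in pii:
                for aj in pij:
                    clash = ((q1i == q1j and ai != aj) or
                             (q2i == q2j and pii[ai] != pij[aj]))
                    if clash:
                        fixed[(("v", i, ai), ("u", j, aj))] = (j - i) / 2
    return fixed, variable

# Toy input (hypothetical): clauses C1 = (x1 or x2), C2 = (x1 or x3), ell = 1.
answers = ["11", "10", "01"]                  # satisfying assignments of a 2-clause
def proj(pos):
    return {a: a[pos] for a in answers}       # answer restricted to one variable
toy_constraints = [("C1", "x1", proj(0)), ("C1", "x2", proj(1)),
                   ("C2", "x3", proj(1)), ("C2", "x1", proj(0))]
fixed, variable = build_instance(toy_constraints, [True] * 4)
shortcuts = {e: c for e, c in fixed.items() if e[0][0] == "v" and e[1][0] == "u"}
```

On this toy input the sketch produces 12 variable edges, and every shortcut joins two answers that disagree on a shared clause or a shared variable.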

4 Intuition and Overview of the Analysis 

Before we move on to the analysis, we explain the intuition behind the hardness construction in the previous 
section and the analysis in the next section. 

NP-hardness First, let us understand what happens when we apply the construction in Section 3.2 to the Raz
verifier's $\Phi$ without applying Algorithm $\mathcal{A}$ (cf. Section 3.1) to get a $(\delta, \gamma)$-far sequence; in other words, the
sequence of constraints is arbitrary.

We use the following example to convey the idea. Consider a Max 2SAT instance with three variables
$x_1, x_2, x_3$ and two clauses $C_1 = (x_1 \vee x_2)$ and $C_2 = (x_1 \vee x_3)$. (For the sake of simplicity, we consider
an instance of Max 2SAT instead of Max 3SAT.) The constraints of the Raz verifier with $\ell = 1$ repetition
are $r_1 = (C_1, x_1)$, $r_2 = (C_1, x_2)$, $r_3 = (C_2, x_3)$, and $r_4 = (C_2, x_1)$. If we construct the graph $G$ from the
sequence of constraints $r_1, r_2, r_3, r_4$ according to the construction in Section 3.2, then we will get the graph
$G$ as in Figure 1.

Consider any pricing $p$ and let $P$ be the corresponding shortest path from $s$ to $t$. We classify the shortcuts
both of whose endpoints are on $P$ into two types, edges that are contained in $P$ and edges that are induced by
$P$, as follows.

Definition 4.1. We say that $P$ contains an edge $e$ if $e$ is an edge on $P$, and we say that $P$ induces $e$ if $e$ is
not an edge on $P$ but both end vertices of $e$ are on $P$. We say that $P$ involves $e$ if $P$ contains or induces $e$.

Observe that if $P$ involves no shortcuts then we can construct a satisfying assignment from $P$. For
example, a path $s_1 u_1^{(11)} v_1^{(11)} t_1 s_2 u_2^{(11)} v_2^{(11)} t_2 s_3 u_3^{(10)} v_3^{(10)} t_3 s_4 u_4^{(10)} v_4^{(10)} t_4$ involves no shortcuts and could be
converted to an assignment $x_1 = 1$, $x_2 = 1$ and $x_3 = 0$. Conversely, a satisfying assignment of $\Phi$ can also
be converted to a solution (a price function) with respect to which the corresponding shortest path involves
no shortcut edges. Moreover, observe that if $P$ involves no shortcuts then we can get a revenue of $M$ by
setting the price of all variable edges to 1, and we always get a revenue less than $M$ otherwise. The following
observation follows: $\Phi$ has a satisfying assignment if and only if there is a solution that gives a revenue
of $M$ in the corresponding graph $G$. This observation, along with the reduction from Max 3SAT, already
leads to the NP-hardness of StackSP. This is in fact the essential idea used in the previous hardness results
[?].



Beyond NP-hardness To extend the above idea to a constant-hardness, we further observe an effect of 
the shortcuts on the revenue. In particular, we observe that if there are many "parts" of the shortest path that 
either contain or induce too many shortcuts then the revenue can be essentially at most M/2. To be more 
precise, let us first make the following two observations. 

First, observe that if $P$ contains shortcuts $e_1, e_2, \ldots, e_k$, for some $k$, with costs $c_1, c_2, \ldots, c_k$ then we can
collect a revenue of at most $M - \sum_{i=1}^{k} c_i$ from $P$. This is because there is a path of length $M$ from $s$ to
$t$ and, for each $i$, once edge $e_i$ with fixed cost $c_i$ is used, the revenue on $P$ decreases by $c_i$. For example,
the path $P_1 = s_1 u_1^{(11)} v_1^{(11)} u_2^{(10)} v_2^{(10)} t_2 s_3 u_3^{(10)} v_3^{(10)} u_4^{(11)} v_4^{(11)} t_4$ contains two shortcuts $v_1^{(11)} u_2^{(10)}$ and $v_3^{(10)} u_4^{(11)}$
of cost 1/2 each. Therefore, any solution in which such a path is the corresponding shortest path gives a
revenue of at most $4 - 1/2 - 1/2 = 3$.

Secondly, consider when $P$ induces a shortcut edge $e$ from gadget $G_i$ to gadget $G_j$ with cost $d$ and, for
some reason, the edges in the gadgets $G_i$ and $G_j$ can have price at most 1 each. Then we can collect a revenue
of roughly $M - (j - i) + d + 2$. This is because we cannot collect more than $d + 2$ on the subpath of $P$ from
gadget $G_i$ to gadget $G_j$. For example, consider a path $P_2 = s_1 u_1^{(11)} v_1^{(11)} t_1 s_2 u_2^{(11)} v_2^{(11)} t_2 s_3 u_3^{(11)} v_3^{(11)} t_3 s_4 u_4^{(01)} v_4^{(01)} t_4$
which induces a shortcut $v_1^{(11)} u_4^{(01)}$ of cost 1. For a pricing under which $P_2$ is the shortest path, we can collect a
revenue of at most 3 for the following reason. First, we can collect at most 1 from edge $u_1^{(11)} v_1^{(11)}$ because
edge $s_1 t_1$ would be used otherwise. Similarly, we can collect at most 1 from edge $u_4^{(01)} v_4^{(01)}$. Moreover, we
can collect at most 1 from $u_2^{(11)} v_2^{(11)}$ and $u_3^{(11)} v_3^{(11)}$ altogether because the shortcut $v_1^{(11)} u_4^{(01)}$ would be used
otherwise.

In summary, the observations above imply that a shortcut from gadget i to gadget j (either contained or 
induced) causes the revenue on the subpath from gadget Gi to gadget Gj to be bounded by (j — i)/2 + 2. 

The role of the $(\delta, \gamma)$-far sequence Before we proceed to show the consequence of these observations, we
would like to eliminate the effect of the constant "+2" in the bound of the revenue above since it will
be an obstacle in the analysis. In particular, to get the factor of 2 hardness, we would like to say that we
can get a revenue of roughly $(j - i)/2$ and somehow conclude that the graph reduced from a No-Instance
gives a revenue of at most $M/2$. (Recall that we can get a revenue of $M$ in a Yes-Instance.) However, the
constant +2 is a problem when $j - i$ is small.

We eliminate the above effect in a straightforward way: instead of including the shortcuts for every
constraint, we consider only the shortcuts with large cost $(j - i)/2$. The problem is, when we throw
away some constraints, the constraint satisfaction problem becomes easier, and we should be able to satisfy
a larger fraction of the constraints. We do not want this to happen. We want to somehow make sure that by
neglecting a particular set of "bad" constraints, the soundness parameter does not grow by much. Roughly
speaking, Section 3.1 shows that we can get the desired properties while the soundness parameter remains
comparatively small. In particular, we lose an additive factor of $\gamma$ in the soundness parameter. (Please refer
to Section 3.1 for more details.)

Getting 2-approximation hardness Now that we can eliminate the effect of the constant +2, let us see
how we can use the above two observations to conclude the 2-approximation hardness. Intuitively, the two
observations above imply that if the shortest path $P$ involves many shortcuts then the revenue we can collect
on $P$ is essentially at most $M/2$. To prove this intuitive assertion, we argue in the next section that we can
always decompose $P$ into three types of paths: paths that look like $P_1$, paths that look like $P_2$, and paths that
can be converted to a solution for $\Phi$ such that the number of satisfied constraints is equal to the number of
variable cost edges in such paths altogether. This decomposition needs to be carefully designed to maintain
the properties of the three types of paths and will be elaborated in Section 5.3.

Using the above decomposition and the fact that paths of the first two types give a revenue of at most
half of their lengths, we conclude that the revenue is at most $M/2 + c$ where $c$ is the number of edges in
the paths of the third type. Using the fact that $\Phi$ is $(\delta, \gamma)$-far, we conclude that $c$ is at most $(\gamma + \epsilon/3)M$
where $\epsilon$ is the constant as in Theorem 1.1. By considering large enough $n$ (and thus, large enough $|\Phi|$) and
choosing appropriate values of $\delta$ and $\gamma$ so that $c < \epsilon M$, we have that the revenue is at most $(1/2 + \epsilon)M$.
This implies the gap of $2 - \epsilon$, and Theorem 1.1 thus follows. We formalize these ideas in the next section.

5 Analysis 

Now we prove Theorem 1.1 using the reduction in Section 3. Recall that $\epsilon$ is a constant as in Theorem 1.1
and we let $\ell = \lceil \log(3/\epsilon)/\alpha \rceil$ (where $\alpha$ is as in Theorem 2.2), $\delta = (\epsilon/10)5^{-\ell}$ and $\gamma = \epsilon/3$. It follows that
the soundness parameter of the Raz verifier is $2^{-\alpha\ell} \leq \epsilon/3$. (I.e., if $\varphi$ is a No-Instance, then at most an $\epsilon/3$
fraction of constraints in $\Phi$ can be satisfied.)
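
For concreteness, the parameter choice can be evaluated numerically. The sketch below is illustrative only: the function name is hypothetical, the logarithm is assumed to be base 2 (which matches the soundness bound $2^{-\alpha\ell}$), and the value of $\alpha$ passed in is a made-up placeholder, since the text only states that $\alpha$ is a universal constant. The value of $\delta$ is copied as read from the text.

```python
import math

def reduction_parameters(eps, alpha):
    """Return (ell, delta, gamma, soundness) for the parameter choice in the text."""
    ell = math.ceil(math.log2(3 / eps) / alpha)   # assumes log is base 2
    delta = (eps / 10) * 5 ** (-ell)
    gamma = eps / 3
    soundness = 2 ** (-alpha * ell)               # acceptance bound for a No-Instance
    return ell, delta, gamma, soundness

# alpha = 0.5 is a placeholder value chosen only for this illustration.
ell, delta, gamma, soundness = reduction_parameters(eps=0.3, alpha=0.5)
assert soundness <= 0.3 / 3
```

Rounding $\ell$ up is what guarantees that the soundness stays at or below $\epsilon/3$.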

In this section, we show that when the size of $\varphi$ (denoted by $n$) is large enough, the reduction gives a
$(2 - \epsilon)$-gap between the case when $\varphi$ is satisfiable and when it is not. In particular, in Section 5.1 we show
that if $\varphi$ is satisfiable, then there is a price function that collects a revenue of $M$. Moreover, in Section 5.2 we
show that if $\varphi$ is not satisfiable and $n$ is large enough, there is no pricing strategy which collects a revenue
of more than $(1/2 + \epsilon)M$. The value of $n$ will be specified in Section 5.2.

5.1 Yes-Instance 

Let $f : Q_1 \to A_1, Q_2 \to A_2$ be an assignment that satisfies every constraint in $\Phi$. For gadget $G_i$
corresponding to the constraint $r_i$, set price 1 to the edge from $u_i^a$ to $v_i^a$ for $a = f(q_1(r_i))$. Other variable cost
edges in $G_i$ are assigned the price of $\infty$. We now show that we can collect a revenue of $M$ in this case.

Let $P$ be the shortest path on this graph with respect to the above pricing. Notice that path $P$ does not
contain any shortcut since a shortcut only goes between two edges that represent inconsistent assignments.
(I.e., if there is a shortcut from $v_i^{a_i}$ to $u_j^{a_j}$ on $P$ then either $a_i$ is not consistent with $a_j$ or $\pi_i(a_i)$ is not
consistent with $\pi_j(a_j)$. Specifically, either $q_1(r_i) = q_1(r_j)$ and $a_i \neq a_j$, or $q_2(r_i) = q_2(r_j)$ and $\pi_i(a_i) \neq
\pi_j(a_j)$. However, this is impossible since if $q_1(r_i) = q_1(r_j)$ then $a_i = a_j = f(q_1(r_i))$ and, similarly, if
$q_2(r_i) = q_2(r_j)$ then $\pi_i(a_i) = \pi_j(a_j) = f(q_2(r_i))$.)

Since the shortcut is not used, the length of P is exactly M. Moreover, observe that the path that uses 
all variable edges of price 1 also has length M. This path is a shortest path and gives a total revenue of M. 

5.2 No-Instance

We assume for contradiction that there is a pricing function which collects a revenue of $(1/2 + \epsilon)M$. Let $p$
be such a pricing function and let $P$ be the corresponding shortest path. Our goal is to construct an assignment
that satisfies more than $\epsilon M/3$ constraints in $\Phi$. This will contradict the soundness parameter $\epsilon/3$ of the Raz
verifier.

Definition 5.1. A subpath $Q \subseteq P$ is said to be a source-sink subpath of $P$ if it starts at some source $s_i$ and
ends at some sink $t_j$ for $i \leq j$. For any source-sink subpath $Q$, denote by $s(Q)$ and $t(Q)$ the gadget indices
to which the source and sink of $Q$ belong, respectively.

Now, let $Q$ be any source-sink subpath and let $s_i$ and $t_j$ be its source and sink, respectively. Let $\mathcal{S} =
\{Q_1, \ldots, Q_k\}$ be a set of source-sink subpaths of path $Q$. We say that $\mathcal{S}$ is a source-sink partition of path
$Q$ if $s(Q_1) = i$, $t(Q_k) = j$, and for all $p < k$, we have $t(Q_p) + 1 = s(Q_{p+1})$.

The following theorem is the key idea to proving the result. 

Theorem 5.2 (Path Decomposition). Let $p : E_v \to \mathbb{R}^+ \cup \{0\}$ be the optimal pricing of the variable edges
and $P$ be the corresponding shortest path in the graph. Then we can find sets $\mathcal{R}$ and $\mathcal{R}'$ such that the
following properties hold.

D1. $\mathcal{R} \cup \mathcal{R}'$ is a source-sink partition of $P$.

D2. The total revenue collected from edges on paths in $\mathcal{R}'$ is at most $M/2 + O(1/\delta)$. In other words,
$\sum_{e \in E_v \cap (\bigcup_{P' \in \mathcal{R}'} P')} p(e) \leq M/2 + O(1/\delta)$.

D3. The price of any variable cost edge in $\mathcal{R}$ is at most 1. That is, $p(e) \leq 1$ for any $e \in E_v \cap (\bigcup_{P' \in \mathcal{R}} P')$.

D4. There is no shortcut between any two variable cost edges in $\mathcal{R}$.

We defer the proof of this theorem to the next section. Meanwhile we show how the theorem implies
that we can construct an assignment that satisfies more than an $\epsilon/3$ fraction of the constraints in $\Phi$, thus a
contradiction to the soundness parameter. First, we consider only when $n$ is sufficiently large so that we can
collect at most $M/2 + O(1/\delta) < M/2 + \epsilon M/3$ from edges in $\mathcal{R}'$ (from Property D2). Consequently, at
least $2\epsilon M/3$ must be collected from edges in $\mathcal{R}$.

Let $E'$ be the set of all variable cost edges that lie on some paths in $\mathcal{R}$. From Property D3, each such edge
contributes at most 1 to the revenue, so we have $|E'| > 2\epsilon M/3$. Let $F \subseteq E'$ be the set of edges in $E'$ that lie
in far gadgets. Recall that we have at most $\epsilon M/3$ gadgets that are not far (after we run the algorithm $\mathcal{A}$ in
Theorem 3.2), so $|F| > \epsilon M/3$.

We are now ready to describe how we get an assignment that satisfies a large fraction of constraints in
$\Phi$. Each edge $e \in F$ can be written as $u_i^a v_i^a$ for some gadget $i$. We assign the answer $a$ to
query $q_1(r_i)$ and $\pi_i(a)$ to query $q_2(r_i)$. This assignment satisfies the constraint $r_i$. This process satisfies at
least $\epsilon M/3$ constraints corresponding to the edges in $F$, provided that there is no conflict in the assignment.

We argue that there is no such conflict since there is no shortcut between the edges in $F$. Assume, for contradiction,
that the above process assigns conflicting answers to the same query $q$. This means that there are two
constraints $r_i, r_j \in \Phi$ for $i < j$ with $q = q_1(r_i) = q_1(r_j)$ or $q = q_2(r_i) = q_2(r_j)$, and that query $q$ was
assigned different answers $a_i$ and $a_j$ when processing gadgets $i$ and $j$. Since both $r_i$ and $r_j$ are far gadgets,
by construction, there must be a shortcut between the two vertices $v_i^{a_i}$ and $u_j^{a_j}$. This contradicts the fact that
there is no shortcut in $\mathcal{R}$.
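
The assignment step above can be sketched in a few lines of Python. This is only an illustration under assumed data structures: `F` is taken to be a list of pairs `(i, a)` standing for the edges $u_i^a v_i^a$, and `constraints[i] = (q1, q2, pi)` is a hypothetical encoding of constraint $r_i$ with its two queries and its projection $\pi_i$ stored as a dictionary. The function raises an error exactly when two edges would give different answers to the same query, which is the conflict that Property D4 is meant to rule out.

```python
def build_assignment(F, constraints):
    """Assign answers to queries from the edges in F.

    F: iterable of (i, a) pairs, one per edge u_i^a v_i^a (hypothetical encoding).
    constraints: dict mapping gadget index i to (q1, q2, pi), where pi is a
    dict playing the role of the projection pi_i.
    """
    assignment = {}
    for i, a in F:
        q1, q2, pi = constraints[i]
        for query, answer in ((q1, a), (q2, pi[a])):
            # setdefault stores the answer the first time a query is seen
            # and returns the stored answer afterwards.
            if assignment.setdefault(query, answer) != answer:
                raise ValueError(f"conflicting answers for query {query!r}")
    return assignment


constraints = {
    1: ("x", "y", {0: 1, 1: 0}),
    2: ("x", "z", {0: 0, 1: 1}),
}
print(build_assignment([(1, 0), (2, 0)], constraints))
```

In this toy instance both gadgets share the query `"x"`, so the edges `(1, 0)` and `(2, 1)` would conflict, while `(1, 0)` and `(2, 0)` are compatible.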

5.3 Proof of Theorem 5.2

Consider any source-sink subpath $Q$. Since there is a fixed-cost path of length $t(Q) - s(Q) + 1$ from $s_{s(Q)}$
to $t_{t(Q)}$, the revenue collected on $Q$ is at most $t(Q) - s(Q) + 1$, which will be denoted by $\mathrm{len}(Q)$. We let
$\mathrm{rev}(Q)$ be the revenue collected on subpath $Q$, i.e., $\mathrm{rev}(Q) = \sum_{e \in Q \cap E_v} p(e)$. First, observe the following
lemma, whose proof is simple and is deferred to the Appendix.

Lemma 5.3. If $\mathcal{S} = \{Q_1, \ldots, Q_k\}$ is a source-sink partition of $Q$, then $\sum_{i=1}^{k} \mathrm{len}(Q_i) = \mathrm{len}(Q)$.
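
As a small sanity check of the definition and of Lemma 5.3, the following Python sketch represents each source-sink subpath only by its pair of gadget indices $(s(Q), t(Q))$. The tuple encoding and the function names are assumptions made for illustration and do not appear in the construction.

```python
def length(q):
    """len(Q) = t(Q) - s(Q) + 1 for a subpath encoded as (s, t)."""
    s, t = q
    return t - s + 1


def is_source_sink_partition(parts, whole):
    """Check s(Q_1) = s(Q), t(Q_k) = t(Q), and t(Q_p) + 1 = s(Q_{p+1})."""
    if not parts:
        return False
    if parts[0][0] != whole[0] or parts[-1][1] != whole[1]:
        return False
    return all(a[1] + 1 == b[0] for a, b in zip(parts, parts[1:]))


parts = [(3, 4), (5, 9), (10, 10)]
whole = (3, 10)
assert is_source_sink_partition(parts, whole)
# Lengths 2 + 5 + 1 add up to len(Q) = 10 - 3 + 1 = 8.
assert sum(length(q) for q in parts) == length(whole)
```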

We now explain the decomposition of the shortest path $P$ (from Theorem 5.2) into several source-sink
subpaths. Each subpath is contained in one of the sets $\mathcal{R}$, $\mathcal{S}$ and $\mathcal{T}$. In the end, we let $\mathcal{R}'$ in Theorem 5.2
equal $\mathcal{T} \cup \mathcal{S}$. The decomposition consists of two phases. We next describe each phase and prove the
properties in Theorem 5.2 along the way.

In the first phase, our goal is to make sure that $\mathcal{R}$ contains only source-sink subpaths that do not contain
any shortcut. Initially, we set $\mathcal{R} = \{P\}$ and $\mathcal{S} = \mathcal{T} = \emptyset$. We then remove the portions of
path $P$ that contain shortcut edges and add them to the set $\mathcal{S}$. We ensure that paths are always cut into
source-sink subpaths. In particular, we do the following.

Phase 1: Initially, $\mathcal{R} = \{P\}$ and $\mathcal{T} = \mathcal{S} = \emptyset$. While there exists a path $P' \in \mathcal{R}$ that contains a shortcut
edge, do the following. Let $vv'$ be any shortcut edge on $P'$. Let $s_i$ be the last source vertex that appears before $v$
in $P'$ and let $t_j$ be the first sink vertex that appears after $v'$ in $P'$. Here $i$ and $j$ denote the gadget indices
to which these vertices belong. First remove $P'$ from $\mathcal{R}$. Denote by $Q$ the source-sink subpath of $P'$ from $s_i$
to $t_j$. We break $P'$ into three (possibly empty) source-sink subpaths $Q_l$, $Q$, and $Q_r$: (i) $Q_l$ starts at the source of $P'$
and ends at vertex $t_{i-1}$, (ii) $Q$ starts at $s_i$ and ends at $t_j$, and (iii) $Q_r$ starts at $s_{j+1}$ and ends at
the sink of $P'$. We then add $Q$ to $\mathcal{S}$ and add $Q_l, Q_r$ back to $\mathcal{R}$.
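
The cutting rule of Phase 1 can be illustrated with a short Python sketch. It works on a simplified, hypothetical encoding: a path is a list of `(kind, gadget)` tuples with `kind` in `{"s", "u", "v", "t"}`, answer superscripts are dropped, and an edge from a `v`-vertex to a `u`-vertex of a later gadget is treated as a shortcut. It is meant only to show how each path is split into $Q_l$, $Q$, and $Q_r$.

```python
def is_shortcut(a, b):
    # Toy convention: an edge from a v-vertex to a u-vertex of a later gadget.
    return a[0] == "v" and b[0] == "u" and b[1] > a[1]


def phase1(P):
    """Return (R, S): shortcut-free subpaths and subpaths containing shortcuts."""
    R, S = [list(P)], []
    while True:
        hit = None
        for idx, path in enumerate(R):
            for k in range(len(path) - 1):
                if is_shortcut(path[k], path[k + 1]):
                    hit = (idx, k)
                    break
            if hit is not None:
                break
        if hit is None:
            return R, S
        idx, k = hit
        path = R.pop(idx)
        # Last source at or before v, and first sink after v'.
        start = max(x for x in range(k + 1) if path[x][0] == "s")
        end = min(x for x in range(k + 1, len(path)) if path[x][0] == "t")
        S.append(path[start:end + 1])                                    # Q
        R.extend(q for q in (path[:start], path[end + 1:]) if q)       # Q_l, Q_r


P = [("s", 1), ("u", 1), ("v", 1), ("t", 1),
     ("s", 2), ("u", 2), ("v", 2), ("u", 4), ("v", 4), ("t", 4),
     ("s", 5), ("t", 5)]
R, S = phase1(P)
```

In this example the only shortcut goes from gadget 2 to gadget 4, so the subpath from $s_2$ to $t_4$ moves to `S`, while the pieces covering gadget 1 and gadget 5 stay in `R`.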

Consider the set $\mathcal{R}' = \mathcal{S} \cup \mathcal{T}$. We show that, after this phase, the output satisfies properties D1,
D2 and D3. After the second phase, property D4 will be satisfied while the other properties continue to hold.
Observe that property D1 holds simply because the way we break path $P'$ guarantees that $s(Q_l) = s(P')$,
$t(Q_l) + 1 = s(Q)$, $t(Q) + 1 = s(Q_r)$, and $t(Q_r) = t(P')$. The next two lemmas prove properties D3 and
D2.

Lemma 5.4 (Property D3). After Phase 1, $p(e) \le 1$ for any variable edge $e \in E_v$ that belongs to some path
$Q$ in $\mathcal{R}$.

Proof. Since path $Q$ does not contain shortcuts, vertices $s_i$ and $t_i$ lie on $Q$ for all $s(Q) \le i \le t(Q)$. Recall
that edge $e$ can be written in the form $u_j^a v_j^a$ for some $j$ and $a \in A(q_1(r_j))$. If $p(e) > 1$, we can obtain a path
shorter than $P$ by using the fixed cost edge $s_j t_j$ of cost 1 instead of $s_j u_j^a v_j^a t_j$. This contradicts the fact that
$P$ is a shortest path. □

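Lemma 5.5 (Property D2). After the first phase, the revenue in $\mathcal{R}' = \mathcal{S} \cup \mathcal{T}$ is at most $M/2 + O(1/\delta)$. In
particular, $\sum_{Q \in \mathcal{S}} \mathrm{rev}(Q) \le \frac{1}{2} \left( \sum_{Q \in \mathcal{S}} \mathrm{len}(Q) \right) + O(1/\delta)$.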

Proof. We will need the following claim. 

Claim 5.6. For each path $Q \in \mathcal{S}$, we have $\mathrm{rev}(Q) \le (\mathrm{len}(Q) + 1)/2$.
Proof. Consider a path $Q \in \mathcal{S}$ from $s_i$ to $t_j$. Recall that there is a path of length $\mathrm{len}(Q)$ in $G$ from $s_i$ to $t_j$,
so the total cost of $Q$ is at most $\mathrm{len}(Q)$. It is, therefore, sufficient to prove that the total cost of the shortcuts
contained in $Q$ is at least $(\mathrm{len}(Q) - 1)/2$. The way we construct paths in $\mathcal{S}$ guarantees that path $Q$ must be
of the form

$s_i \to u_{i_1}^{a_1} \Rightarrow v_{i_1}^{a_1} \to u_{i_2}^{a_2} \Rightarrow v_{i_2}^{a_2} \to \cdots \to u_{i_q}^{a_q} \Rightarrow v_{i_q}^{a_q} \to t_j$,

where $i_1 = i$, $i_q = j$, and edges of the form $u_{i_x}^{a_x} \Rightarrow v_{i_x}^{a_x}$ are the variable cost edges, from which we can
collect revenue. The other edges, of the form $v_{i_x}^{a_x} \to u_{i_{x+1}}^{a_{x+1}}$ for $1 \le x < q$, are shortcuts. Hence the total cost
of shortcuts can be written as a telescoping sum, $\sum_{x=1}^{q-1} (i_{x+1} - i_x)/2 = (j - i)/2 = (\mathrm{len}(Q) - 1)/2$. □
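
For concreteness, here is a small worked instance of the telescoping argument. The gadget indices are chosen purely for illustration and ignore the minimum span that a shortcut must have in the construction: take $i = 2$, $j = 9$, and a path that visits gadgets $i_1 = 2$, $i_2 = 5$, $i_3 = 9$.

```latex
\begin{align*}
\mathrm{len}(Q) &= t(Q) - s(Q) + 1 = 9 - 2 + 1 = 8, \\
\text{cost of shortcuts} &= \frac{5 - 2}{2} + \frac{9 - 5}{2} = \frac{9 - 2}{2} = \frac{7}{2} = \frac{\mathrm{len}(Q) - 1}{2}, \\
\mathrm{rev}(Q) &\le \mathrm{len}(Q) - \frac{7}{2} = \frac{9}{2} = \frac{\mathrm{len}(Q) + 1}{2}.
\end{align*}
```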

By the claim, $\sum_{Q \in \mathcal{S}} \mathrm{rev}(Q) \le \sum_{Q \in \mathcal{S}} (\mathrm{len}(Q)/2 + 1/2) \le \frac{1}{2} \left( \sum_{Q \in \mathcal{S}} \mathrm{len}(Q) \right) + |\mathcal{S}|/2$. It then suffices
to bound the size of the set $\mathcal{S}$ by $O(1/\delta)$. Notice that each path in $\mathcal{S}$ contains at least one shortcut. Recall that,
by the construction (cf. Section 3.2), a shortcut goes from $v_i^{a}$ to $u_j^{a'}$ only if $|j - i| > \delta M$. Since the
intervals in the set $\{[s(Q), t(Q)] : Q \in \mathcal{S}\}$ are disjoint (by definition of source-sink partition), we can have
at most $O(1/\delta)$ paths in $\mathcal{S}$. □
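
The counting step at the end of this proof can be written out explicitly. The derivation below is a sketch under two assumptions taken from the surrounding text: the gadgets on $P$ are indexed $1, \ldots, M$, and every path in $\mathcal{S}$ contains a shortcut spanning more than $\delta M$ gadget indices, so that $\mathrm{len}(Q) > \delta M$.

```latex
\begin{align*}
M \;\ge\; \sum_{Q \in \mathcal{S}} \mathrm{len}(Q)
  \;=\; \sum_{Q \in \mathcal{S}} \bigl( t(Q) - s(Q) + 1 \bigr)
  \;>\; |\mathcal{S}| \cdot \delta M
\quad\Longrightarrow\quad
|\mathcal{S}| \;<\; \frac{1}{\delta}.
\end{align*}
```

The first inequality uses the disjointness of the intervals $[s(Q), t(Q)]$ inside $[1, M]$.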

This completes the description and the proof of Phase 1. Now no path in $\mathcal{R}$ contains a shortcut.
In Phase 2, our goal is to eliminate the shortcuts between paths in $\mathcal{R}$. (Note that these shortcuts are not
contained in $P$.) Roughly speaking, we scan the gadgets from left to right, and once we find such a shortcut,
we move the whole path that induces this shortcut to the set $\mathcal{T}$. The details are as follows.

Phase 2: Initially, we have $\mathcal{R}$ and $\mathcal{S}$ from Phase 1, and $\mathcal{T} = \emptyset$. We proceed in iterations starting from
iteration 1. The description of iteration $i$ is as follows:

• We first check if source $s_i$ belongs to some path in $\mathcal{R}$. If not, we proceed to iteration $i + 1$.

• If $s_i$ does belong to some path $Q$ in $\mathcal{R}$, we do the following. We check if there is a shortcut (that is not
contained in $Q$) leaving from some vertex $v_i^{a_i}$ on $Q$ to some vertex $u_j^{a_j}$ on some path $Q' \in \mathcal{R}$. Note that
$Q$ and $Q'$ may be the same. If there is no such shortcut, we proceed to iteration $i + 1$. Otherwise, let $P' \subseteq P$ be
the source-sink subpath from $s_i$ to $t_j$. We first remove from
$\mathcal{R}$ and $\mathcal{S}$ all paths $Q''$ such that $Q'' \cap P' \neq \emptyset$. Let $Q_l$ be the source-sink subpath of $Q$ with $s(Q_l) = s(Q)$
and $t(Q_l) = s(P') - 1$. Also, we let $Q_r$ be the source-sink subpath of $Q'$ with $s(Q_r) = t(P') + 1$ and
$t(Q_r) = t(Q')$. We add $P'$ to $\mathcal{T}$, and add $Q_l$ and $Q_r$ back to $\mathcal{R}$.
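
The bookkeeping of Phase 2 can be sketched at the level of gadget intervals. The Python code below is an illustration only and rests on several assumptions not fixed by the text: every path is encoded as a pair `(s, t)` of gadget indices, `shortcuts` is a hypothetical set of pairs `(i, j)` with `i < j` meaning that a shortcut leaves the path vertex in gadget `i` and enters the path vertex in gadget `j`, and when several shortcuts leave gadget `i` the one with the smallest `j` is used.

```python
def phase2(R, S, shortcuts, M):
    """Interval-level sketch of Phase 2; returns the updated (R, S, T)."""
    R, S, T = list(R), list(S), []
    for i in range(1, M + 1):
        # Does the source of gadget i belong to some path Q in R?
        Q = next((q for q in R if q[0] <= i <= q[1]), None)
        if Q is None:
            continue
        # Look for a shortcut from gadget i into some path Q' in R.
        found = next(((j, q) for (a, j) in sorted(shortcuts) if a == i
                      for q in R if q[0] <= j <= q[1]), None)
        if found is None:
            continue
        j, Q_prime = found
        # Remove every path of R and S that intersects P' = [i, j].
        R = [q for q in R if q[1] < i or q[0] > j]
        S = [q for q in S if q[1] < i or q[0] > j]
        if Q[0] <= i - 1:
            R.append((Q[0], i - 1))          # Q_l
        if j + 1 <= Q_prime[1]:
            R.append((j + 1, Q_prime[1]))    # Q_r
        T.append((i, j))                     # P'
    return sorted(R), sorted(S), T


R, S, T = phase2(R=[(1, 3), (7, 10)], S=[(4, 6)], shortcuts={(2, 8)}, M=10)
```

In this toy run the shortcut from gadget 2 to gadget 8 moves the interval $[2, 8]$ to `T`, absorbs the path $(4, 6)$ of `S`, and leaves $(1, 1)$ and $(9, 10)$ in `R`. The interval lengths still sum to $M = 10$, in line with Property D1.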

We now check the properties. Property D1 holds simply because, in each iteration, we remove only
subpaths of what we will add (i.e., we may add paths $Q$, $P'$ and $Q'$ to $\mathcal{R}$ and $\mathcal{T}$ and remove only subpaths
of $Q \cup P' \cup Q'$). Since paths in $\mathcal{R}$ only get chopped off, Lemma 5.4 still holds, and so does property D3.
Properties D4 and D2 follow from the following lemmas, whose proofs are in the Appendix.

Lemma 5.7 (Property D4). After Phase 2, there is no shortcut between any two subpaths in $\mathcal{R}$.

Lemma 5.8 (Property D2). $\sum_{Q \in \mathcal{T}} \mathrm{rev}(Q) \le \frac{1}{2} \left( \sum_{Q \in \mathcal{T}} \mathrm{len}(Q) \right) + O(1/\delta)$.

APPENDIX

A Derandomization of Algorithm A' in Theorem 3.2

Now we derandomize $A'$ to get a deterministic algorithm $A$ by the method of conditional expectation.
Let $Y$ denote the number of constraints that are not $\delta$-far with respect to a random permutation $\pi$. For
a fixed permutation $\pi'$, let $\mathcal{E}(\pi', I)$ be the event that $\pi$ agrees with $\pi'$ on the set $I$ (i.e., $\pi'(i) = \pi(i)$ for all
$i \in I$). Notice that we can efficiently compute $E[Y \mid \mathcal{E}(\pi', I)]$ for any $\pi'$ and $I$, where the expectation is
over the random permutation $\pi$. Therefore, for $i = 1, 2, \ldots$, we deterministically pick the value of $\pi'(i)$ that
minimizes the value of $E[Y \mid \mathcal{E}(\pi', \{1, \ldots, i\})]$, so that the final number of constraints that are not $\delta$-far is at most $E[Y]$.
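
The method of conditional expectation can be illustrated on a toy objective. In the Python sketch below, `count_bad` is a hypothetical stand-in for $Y$ (the actual definition of $\delta$-far comes from Theorem 3.2 and is not reproduced here): a pair of indices counts as bad when the permutation places them fewer than `d` positions apart. The conditional expectation is computed by brute-force enumeration, which is feasible only for tiny instances, whereas the argument above relies on an efficient computation. Since $Y$ counts constraints that are not $\delta$-far, the sketch fixes each value so that the conditional expectation is as small as possible.

```python
from fractions import Fraction
from itertools import permutations


def count_bad(perm, pairs, d):
    # Toy stand-in for Y: pairs whose images are fewer than d apart.
    return sum(1 for a, b in pairs if abs(perm[a] - perm[b]) < d)


def cond_exp(prefix, n, pairs, d):
    """E[Y | pi agrees with prefix on the first len(prefix) indices], exactly."""
    rest = [x for x in range(n) if x not in prefix]
    values = [count_bad(list(prefix) + list(c), pairs, d)
              for c in permutations(rest)]
    return Fraction(sum(values), len(values))


def derandomize(n, pairs, d):
    """Fix pi(0), pi(1), ... one at a time, never increasing E[Y | prefix]."""
    prefix = []
    for _ in range(n):
        free = [x for x in range(n) if x not in prefix]
        best = min(free, key=lambda x: cond_exp(prefix + [x], n, pairs, d))
        prefix.append(best)
    return prefix


pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
pi = derandomize(5, pairs, 2)
assert count_bad(pi, pairs, 2) <= cond_exp([], 5, pairs, 2)
```

The final assertion is the guarantee the method provides: the deterministic permutation does at least as well as a uniformly random one does in expectation.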

B Construction Size 

We first calculate the size of each gadget $G_i$. There are $O(7^{\ell})$ vertices and $O(7^{\ell})$ edges for each gadget.
Next, we count the number of shortcuts. For each pair of constraints $r_i$ and $r_j$, there are at most $O(7^{2\ell})$
shortcuts between their intermediate vertices. Since there are $(5n)^{\ell}$ gadgets, the graph size is at most $O(n)^{O(\ell)}$.



Since … $\log(3/\epsilon)$ …, the construction size is $O(n)^{O(1/\epsilon)}$, which is polynomial in $n$ if $\epsilon$ is a constant.



C Omitted Proofs from Section 5

C.1 Proof of Lemma 5.3

$\sum_{j=1}^{k} \mathrm{len}(Q_j) = \sum_{j=1}^{k} (t(Q_j) - s(Q_j) + 1) = t(Q_k) - s(Q_1) + 1 = t(Q) + 1 - s(Q) = \mathrm{len}(Q)$, where the
second equality holds because $t(Q_j) + 1 = s(Q_{j+1})$ for all $j < k$ and the third equality holds because $t(Q_k) = t(Q)$
and $s(Q_1) = s(Q)$.

C.2 Proof of Lemma 5.7

Notice that once a shortcut leaving gadget $i$ is found, the whole part of gadget $i$ is removed completely from
$\mathcal{R}$. Therefore, after iteration $i$, there is no shortcut leaving the vertex in $P \cap G_i$ to other vertices lying on
some path in $\mathcal{R}$. (In fact, the vertex in $P \cap G_i$ is not in any path in $\mathcal{R}$ anymore.)

C.3 Proof of Lemma 5.8

Similarly to Claim 5.6, we can also bound the revenue on paths in $\mathcal{T}$, as summarized in the following claim.

Claim C.1. For each path $Q \in \mathcal{T}$, we have $\mathrm{rev}(Q) \le \frac{1}{2} \mathrm{len}(Q) + 2$.

Proof. Consider a path $Q \in \mathcal{T}$ from $s_i$ to $t_j$. Path $Q$ can be written in the form:

$s_i \to u_i^{a_i} \Rightarrow v_i^{a_i} \to \cdots \to u_j^{a_j} \Rightarrow v_j^{a_j} \to t_j$.

Note that we do not assume any structure of the path from $v_i^{a_i}$ to $u_j^{a_j}$. Also, recall that edges $u_i^{a_i} v_i^{a_i}$ and
$u_j^{a_j} v_j^{a_j}$ were in $\mathcal{R}$ after Phase 1 and moved to $\mathcal{T}$ in Phase 2. Moreover, there is a shortcut edge from $v_i^{a_i}$ to
$u_j^{a_j}$ (which is not in $Q$).

Now, let $Q'$ be the subpath of $Q$ from $v_i^{a_i}$ to $u_j^{a_j}$, and let $e_i$, $e_j$ be the edges $u_i^{a_i} v_i^{a_i}$ and $u_j^{a_j} v_j^{a_j}$, respectively.
Then $Q = s_i e_i Q' e_j t_j$. The revenue collected on $Q$ comes from edges in $Q'$ and from $e_i$ and $e_j$. Since both $e_i$
and $e_j$ belonged to some paths in $\mathcal{R}$ after Phase 1, we have $p(e_i) + p(e_j) \le 2$ (cf. Lemma 5.4). Path $Q'$
can collect revenue of at most $(j - i)/2$ due to the fact that there is a shortcut edge $v_i^{a_i} u_j^{a_j}$ of cost $(j - i)/2$.
Overall, the revenue on $Q$ is at most $(j - i)/2 + 2 \le \frac{1}{2} \mathrm{len}(Q) + 2$. □

Since every path $Q \in \mathcal{T}$ induces some shortcut edge (i.e., there is a shortcut edge between some pair
of vertices in $Q$), the length of such a path is at least $\delta M$. Therefore, $|\mathcal{T}| \le O(1/\delta)$. We apply Claim C.1 to
every path in $\mathcal{T}$ and sum up the resulting bounds. This immediately gives the lemma.